Time Hopping technique for faster reinforcement learning in simulations
Authors
Abstract
A technique called Time Hopping is proposed for speeding up reinforcement learning algorithms. It is applicable to continuous optimization problems running in computer simulations. By making shortcuts in time, hopping between distant states, and combining this with off-policy reinforcement learning, the technique maintains a higher learning rate. Experiments on a simulated biped crawling robot confirm that Time Hopping can accelerate the learning process more than seven times.
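The abstract gives no implementation details, but the combination it describes (off-policy value updates plus direct jumps between stored simulator states) can be illustrated. Below is a minimal sketch assuming a simulator object with step() and set_state() methods; the hopping condition and the target-selection rule are placeholder assumptions, not the paper's actual criteria.

```python
import random
from collections import defaultdict

class TimeHoppingQLearner:
    """Sketch of off-policy Q-learning with 'time hops': when learning in the
    current region slows down, the simulator is set directly to a previously
    visited state instead of replaying the trajectory that led there."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1,
                 hop_threshold=1e-3):
        self.Q = defaultdict(float)          # Q[(state, action)] table
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.hop_threshold = hop_threshold   # assumed hopping criterion
        self.visited = []                    # candidate hop targets

    def act(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.Q[(state, a)])

    def step(self, sim, state):
        action = self.act(state)
        next_state, reward = sim.step(action)             # assumed simulator API
        best_next = max(self.Q[(next_state, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next - self.Q[(state, action)]
        self.Q[(state, action)] += self.alpha * td_error
        self.visited.append(next_state)
        # Illustrative hopping condition: a small TD error suggests the current
        # region is already learned, so hop to some other stored state.
        if abs(td_error) < self.hop_threshold:
            target = random.choice(self.visited)           # naive target selection
            sim.set_state(target)                          # assumed simulator API
            return target
        return next_state
```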
Related resources
Eligibility Propagation to Speed up Time Hopping for Reinforcement Learning
A mechanism called Eligibility Propagation is proposed to speed up the Time Hopping technique used for faster Reinforcement Learning in simulations. Eligibility Propagation gives Time Hopping abilities similar to those that eligibility traces give conventional Reinforcement Learning. It propagates values from one state to all of its temporal predecessors using a state transition graph...
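The propagation idea above can be illustrated with a short sketch, assuming the learner has recorded a graph of observed state transitions (predecessors maps each state to the states seen to lead into it); the decay factor and cut-off threshold are assumed values, not taken from the paper.

```python
def propagate_eligibility(V, predecessors, updated_state, delta,
                          decay=0.9, threshold=1e-4):
    """Push a value change backwards through the recorded transition graph.
    V maps states to values; predecessors maps each state to the set of
    states observed to transition into it (assumed data structures)."""
    frontier = [(updated_state, delta)]
    while frontier:
        state, change = frontier.pop()
        attenuated = decay * change
        if abs(attenuated) < threshold:            # stop once the change is negligible
            continue
        for pred in predecessors.get(state, ()):   # all temporal predecessors
            V[pred] += attenuated                  # credit flows backwards in time
            frontier.append((pred, attenuated))
```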
Time Hopping Technique for Reinforcement Learning and its Application to Robot Control
To speed up the convergence of reinforcement learning (RL) algorithms through more efficient use of computer simulations, three algorithmic techniques are proposed: Time Manipulation, Time Hopping, and Eligibility Propagation. They are evaluated on various robot control tasks. The proposed Time Manipulation [1] is a concept of manipulating time inside a simulation and using it as a tool to speed...
Probability Redistribution using Time Hopping for Reinforcement Learning
A method for using the Time Hopping technique as a tool for probability redistribution is proposed. Applied to reinforcement learning in a simulation, it is able to re-shape the state probability distribution of the underlying Markov decision process as desired. This is achieved by modifying the target selection strategy of Time Hopping appropriately. Experiments with a robot maze reinforcement...
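A hedged sketch of how the target-selection step of Time Hopping could be weighted to re-shape the visited-state distribution is shown below; the specific weighting rule, and the names desired_probs and visit_counts, are illustrative assumptions rather than the paper's method.

```python
import random

def select_hop_target(candidates, desired_probs, visit_counts):
    """Pick a hop target so that under-visited states (relative to a desired
    state distribution) are chosen more often, nudging the empirical
    distribution towards the desired one."""
    total = sum(visit_counts.values()) or 1
    weights = []
    for s in candidates:
        empirical = visit_counts.get(s, 0) / total
        # Larger weight when the state is visited less often than desired.
        weights.append(max(desired_probs.get(s, 0.0) - empirical, 0.0) + 1e-6)
    return random.choices(candidates, weights=weights, k=1)[0]
```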
Decomposition of Reinforcement Learning for Admission Control of Self-Similar Call Arrival Processes
This paper presents predictive gain scheduling, a technique for simplifying reinforcement learning problems by decomposition. Link admission control of self-similar call traffic is used to demonstrate the technique. The control problem is decomposed into on-line prediction of near-future call arrival rates, and precomputation of policies for Poisson call arrival processes. At decision time, the...
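The decomposition described above can be sketched briefly, assuming a table of policies precomputed for fixed Poisson arrival rates and an external predictor of the near-future arrival rate; all names here are hypothetical.

```python
def admit_call(predicted_rate, precomputed_policies, link_state):
    """Apply the decomposition: choose the policy precomputed for the Poisson
    arrival rate closest to the predicted near-future rate, then let that
    policy decide whether to admit the call in the current link state."""
    nearest_rate = min(precomputed_policies, key=lambda r: abs(r - predicted_rate))
    policy = precomputed_policies[nearest_rate]   # maps link state -> accept/reject
    return policy(link_state)
```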
Journal: CoRR
Volume: abs/0904.0545
Pages: -
Publication year: 2009